ABSTRACT


When applied to a problem which has more than one local optimal
solution, most nonlinear programming algorithms will terminate
with the first local solution found. Several methods have been
suggested for extending the search to find the global optimum
of such a nonlinear program. In this report we present the
results of some numerical experiments designed to compare the
performance of various strategies for finding the global solution.


I.  INTRODUCTION 

It is frequently the case in applied optimization studies that
an algorithm which is known to converge to a global optimal solution
under certain conditions (such as convexity) will be applied to a
problem which does not satisfy these conditions. In particular,
optimization problems which are suspected of having several local
optima in addition to the global optimum are often solved using
algorithms which will stop and indicate a solution whenever any local
optimum is reached. In such cases a useful strategy is to repeat the
solution process several times starting from different initial points
to avoid accepting a solution which is only a local optimum. This is
probably the most frequently suggested strategy for avoiding local
solutions.

There are also other strategies for avoiding the local solutions
in favor of the global optimum. This paper describes some numerical
experiments which were done to compare the performance of several
strategies for organizing such a global optimization.

II.  The  Problem 

In order to develop and test strategies for avoiding local
solutions it is necessary to specify a class of optimization problems
to be considered. This paper will concentrate on the "essentially
unconstrained" nonlinear programming problem

     minimize     $f(x)$
                                                            (1)
     subject to   $x \in S \subset E^n$


where  the  local  and  global  optimal  solutions  to  (1)  are  known  to  occur 
in  the  interior  of  the  set   S.   In  such  a  problem  the  feasible  region 
S   determines  a  domain  to  be  searched  for  solutions,  but  the  boundaries 
of   S   do  not  determine  the  solutions.   In  this  sense  problem  (1)  can 
be  considered  "essentially  unconstrained." 

Problems of this type arise frequently as the "unconstrained"
subproblems in interior penalty function algorithms such as the
Sequential Unconstrained Minimization Technique of Fiacco and
McCormick [3]. In the SUMT method, if the original nonlinear program
is not a convex program, then the subproblem (1) may have local
solutions which are distinct from the global solution.

For  problems  like  (1)  a  local  optimal  solution  can  be  obtained 
by  applying  any  of  the  efficient  unconstrained  descent  algorithms 
(such  as  the  Davidon-Fletcher-Powell  method)  to  minimize  the  function 
f(x)   while  being  careful  not  to  penetrate  the  boundary  of   S.   We 
shall  now  consider  several  strategies  which  try  to  ensure  that  the 
local  solution  we  finally  accept  is,  in  fact,  a  global  minimum. 

III.   Strategies  For  Avoiding  Local  Solutions 

Six  different  strategies  for  organizing  a  global  optimization 
are  compared  in  this  paper.   These  are  briefly  described  below  with 
references  to  more  complete  descriptions  when  they  exist. 

Strategy S1 (From the folklore)

a.  Set k = 1.

b.  Let $x^k$ be a vector chosen at random in the search
    region S. Starting at $x^k$ perform an unconstrained
    minimization search on the function f(x), terminating
    at the local minimum $x^{k*}$.

c.  Replace k with k + 1 and go to step b. At each
    stage retain the best local solution obtained to date.

S1 is the strategy suggested in section I. Intuitively the problem
with this strategy is that it may repeatedly search to the same local
minimum if the starting points $x^k$ happen to be chosen within the
"range of attraction" of that local minimum. The next three
strategies attempt to solve this problem.
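
As an illustration, strategy S1 can be sketched in a few lines of
Python. This is a minimal sketch under stated assumptions, not the code
used in the experiments: the names `local_minimize`, `strategy_s1`, and
`two_wells` are hypothetical, a simple compass (coordinate pattern)
search stands in for the unconstrained minimizer, and S is assumed to
be a box described by `lower` and `upper`.

```python
import random

def local_minimize(f, x, lower, upper, step=0.5, tol=1e-6):
    """Derivative-free compass search (a stand-in local minimizer)."""
    x = list(x)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                # Clip to the box so the search never leaves S.
                trial[i] = min(max(trial[i] + delta, lower[i]), upper[i])
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step /= 2.0  # no better neighbour: refine the mesh
    return x, fx

def strategy_s1(f, lower, upper, n_starts, rng):
    """Repeat local searches from random starts; keep the best minimum."""
    best_x, best_f = None, float("inf")
    for _ in range(n_starts):
        x0 = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        x, fx = local_minimize(f, x0, lower, upper)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

def two_wells(x):
    # Hypothetical 1-D example: a local minimum near x = 1.13 and a
    # deeper (global) minimum near x = -1.30.
    return x[0] ** 4 - 3.0 * x[0] ** 2 + x[0]

best_x, best_f = strategy_s1(two_wells, [-2.0], [2.0],
                             n_starts=10, rng=random.Random(0))
```

Several of the ten starts end at the same minimum, which is the
wasted effort that S2 through S4 try to avoid.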

Strategy S2

a.  Set k = 1, $f^* = +\infty$.

b.  Randomly select points $x \in S$ until one is found with
    $f(x) < f^*$. Call this point $x^k$.

c.  Starting at $x^k$ perform an unconstrained minimization
    search terminating at a new local minimum $x^{k*}$.

d.  Set $f^* = f(x^{k*})$, replace k with k + 1, and go to
    step b.

In S2 a minimization (step c.) is initiated at $x^k$ only if $f(x^k)$
is smaller than the best solution found to date. Hence, each
successive minimization gives a new local minimum which is better than
any found so far. The same local minimum cannot be located twice. It
is, however, much more difficult to determine the starting points
$x^k$ for strategy S2 than for S1.
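
The cost of step b can be made concrete with a short calculation. As
an assumption not stated in the report, suppose the trial points are
drawn independently and uniformly from S, and let q denote the
fraction of the volume of S on which $f(x) < f^*$. The number of
draws T needed to complete step b is then geometrically distributed:

```latex
\Pr(T = t) = (1 - q)^{t-1}\, q, \qquad t = 1, 2, \ldots,
\qquad\qquad
\mathbb{E}[T] = \frac{1}{q}.
```

As $f^*$ approaches the global minimum value, q shrinks toward zero
and the expected number of function evaluations spent in step b grows
without bound.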


Strategy S3 (Bocharov [1])

a.  Choose $x^1$ randomly in S. Set k = 1.

b.  Starting from $x^k$ perform an unconstrained minimization
    terminating at the local minimum $x^{k*}$.

c.  Choose a direction $d^k \in E^n$ at random and consider
    $f(x^{k*} + \alpha d^k)$ as the positive scalar $\alpha$
    increases. Moving away from $x^{k*}$ in direction $d^k$, the
    function f must initially increase (since $x^{k*}$ is a local
    minimum). Continue to increase $\alpha$ until f begins
    to decrease when $\alpha = \alpha^k$.

d.  Let $x^{k+1} = x^{k*} + \alpha^k d^k$, replace k with k + 1, and
    go to step b.
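
Step c of S3 can be sketched as a simple march along the chosen
direction. The sketch below is illustrative only: the function name
`escape_alpha`, the fixed step length `alpha_step`, and the cap
`max_steps` (which stands in for a check that the point stays inside
S) are assumptions, since the report does not say how the scalar is
increased.

```python
def escape_alpha(f, x_star, d, alpha_step=0.05, max_steps=200):
    """Return the first alpha at which f(x* + alpha d) is seen to decrease.

    Returns None if f never decreases within max_steps steps.
    """
    def along(alpha):
        return f([xi + alpha * di for xi, di in zip(x_star, d)])

    prev = along(0.0)
    for k in range(1, max_steps + 1):
        cur = along(k * alpha_step)
        if cur < prev:
            # f has passed its crest along d; start the next search here.
            return k * alpha_step
        prev = cur
    return None

# Hypothetical 1-D example with a local minimum near x = 1.131.
f = lambda x: x[0] ** 4 - 3.0 * x[0] ** 2 + x[0]
alpha = escape_alpha(f, [1.131], [-1.0])
x_next = [1.131 - alpha]   # start of the next minimization (step d)
```

With a fixed step the crest is located only to within `alpha_step`, so
a smaller step costs more function evaluations but starts the next
minimization closer to the edge of the old region of attraction.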


Strategy S4 (Bocharov [1])

S4 is the same as S3 except that in step c, instead
of choosing the direction at random, $d^k$ is chosen
to be the direction of overall progress from the most
recent minimization

     $d^k = x^{k*} - x^k$                                    (2)

Both   S3  and   S4  attempt  to  prevent  repeated  minimization  to  the 
same  local  optimum  by  moving  out  of  the  region  of  attraction  of  the 


most  recent  local  solution  before  starting  the  next  minimization.   By 
continuing  in  the  direction  (2),  strategy   S4  hopes  to  also  avoid 
local  minima  detected  before  the  most  recent  minimum. 

Strategies S5 and S6 are considerably different from the
first four methods. While S1 - S4 attempt to choose good starting
points for repeated local minimizations, S5 and S6 attempt to
gain information about the entire search region S, gradually
concentrating their attention on portions of S which are in some sense
"likely" to contain the global minimum. S5 and S6 are most easily
described for problems where S is determined by lower and upper
bounds on each variable:

     $S = \{ x \in E^n \mid \ell_i \le x_i \le L_i, \; i = 1,\ldots,n \}$

For ease of presentation we will restrict our attention to such
problems.

Strategy S5 (Piecewise Coordinate Projection - Zakharov [5])

a.  Set up an initially empty list of points, and let
    $\bar{S} = \{ x \in E^n \mid \ell_i \le x_i \le L_i, \; i = 1,\ldots,n \}$
    be the "remaining feasible region." Let $\bar{S} = S$ initially.

b.  Randomly choose N points $x^k \in \bar{S}$, compute $f(x^k)$
    for each, and adjoin them to the list.

c.  For each coordinate $x_i$ of x (i = 1,...,n) separate
    the remaining feasible interval $[\ell_i, L_i]$ into m
    equal subintervals. Let

      $X_{ij}$ = {$x^k$ in the list whose i-th component is
                  in the j-th subinterval of $[\ell_i, L_i]$}

               = $\{ x^k \mid (j-1)(L_i - \ell_i)/m \le x_i^k - \ell_i < j(L_i - \ell_i)/m \}$

    for i = 1,...,n and j = 1,...,m. Then $X_{i1}, X_{i2}, \ldots, X_{im}$
    describe the projection of the list of points $x^k$ into
    the m subintervals of the i-th coordinate axis.

d.  By considering $\{ f(x^k) \mid x^k \in X_{ij} \}$ (i = 1,...,n;
    j = 1,...,m) select the subinterval set $X_{st}$ which
    is considered most likely to contain the global minimum
    (for details see Zakharov [5]).

e.  By redefining $\ell_s$ and $L_s$ delete the subinterval
    sets $X_{sj}$ (j = 1,...,m; j $\ne$ t) from the remaining
    feasible region. Delete all points in the list which
    are in a deleted subinterval. Go to step b.


As  the  remaining  feasible  region  S   gradually  shrinks,  the 
global  minimum  will  be  more  and  more  closely  bracketed.   The  problem 
with this method is that the most promising subinterval must be
determined on the basis of the sample of points $x^k$ chosen so far. There
is  always  a  chance  that  a  subinterval  chosen  for  deletion  will,  in 
fact,  contain  the  global  minimum  solution,  and  once  it  is  deleted 
it  can  never  be  recovered. 
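
The projection in step c of S5 amounts to binning every listed point
once per coordinate. The following is a minimal sketch of that
bookkeeping only; the names `subinterval_index` and `project` are
hypothetical, and placing a point that lies exactly on the upper bound
in the last subinterval is an assumption. The selection rule of step d
is given in Zakharov [5] and is not reproduced here.

```python
def subinterval_index(xi, lo, hi, m):
    """1-based j with (j-1)(hi-lo)/m <= xi - lo < j(hi-lo)/m."""
    j = int((xi - lo) * m / (hi - lo)) + 1
    return min(j, m)  # assumption: xi == hi goes in the last subinterval

def project(points, lower, upper, m):
    """Map (i, j) to the indices k of the points x^k that fall in X_ij."""
    n = len(lower)
    X = {(i, j): [] for i in range(1, n + 1) for j in range(1, m + 1)}
    for k, x in enumerate(points):
        for i in range(n):
            j = subinterval_index(x[i], lower[i], upper[i], m)
            X[(i + 1, j)].append(k)
    return X

points = [[0.1, 0.9], [0.6, 0.4], [1.0, 0.0]]
X = project(points, [0.0, 0.0], [1.0, 1.0], m=4)
```

Each point appears in exactly n of the sets, one per coordinate, so
the sets for a fixed coordinate partition the list.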

Strategy   S6  attempts  to  solve  this  problem  by  retaining  the 
entire  region   S   throughout  and  using  a  probabilistic  allocation 
device  to  concentrate  attention  on  areas  in  S  which  are  most 


promising.   This  algorithm  is  new  and  is  still  under  development. 
Initial  results  show  some  promise,  but  considerable  improvement  is 
still  necessary. 

Strategy S6 (Coordinatewise Allocation)

a.  Define a marginal probability distribution function
    $\Phi_i$ on the feasible interval $[\ell_i, L_i]$ of each
    coordinate axis i = 1,...,n. In the absence of other
    information, a uniform distribution seems reasonable
    for the initial distribution.

b.  Randomly choose N points $x^k \in S$ and compute
    $f(x^k)$ for each. The probability distribution
    functions $\Phi_i$ govern these choices in that the i-th
    component $x_i^k$ of $x^k$ is chosen as a random sample
    point from the distribution $\Phi_i$. Thus, the $\Phi_i$
    determine the allocation of trial points to various regions
    in S.

c.  Based on the results of the trials to date, modify
    the $\Phi_i$ to increase the allocation of future points
    to regions considered likely to contain the global
    minimum. Go to step b.


Strategy S6 can have many realizations depending on the method of
handling step c. In the version of S6 reported in this paper, step
c is performed as follows for each coordinate i = 1,...,n.

1.  The feasible interval $[\ell_i, L_i]$ is split into m
    subintervals.

2.  A "success" is defined as a value of $f(x^k)$ in the
    bottom 25% of all $f(x^k)$ values, and the ratios $r_{ij}$
    of the number of successes in subinterval j of coordinate
    i to the total number of points in subinterval
    j are computed for all i and j.

3.  The modified probability for subinterval j of coordinate
    i is given by $p_{ij} = r_{ij} / \sum_{j=1}^{m} r_{ij}$, the
    normalized success ratio.

Several  improvements  on  this  allocation  scheme  are  being  considered 
for  future  testing. 
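
For a single coordinate, the update in steps 2 and 3 can be sketched
as follows. This is an illustration rather than the authors' code: the
function name `allocation_probabilities` is hypothetical, and the
handling of empty subintervals (ratio taken as zero), of ties at the
25% cutoff (all tied values count as successes), and of an empty
sample (uniform probabilities) are assumptions.

```python
def allocation_probabilities(values, bins, m):
    """Normalized success ratios p_j for one coordinate.

    values[k] is f(x^k); bins[k] is the 0-based subinterval of this
    coordinate that contains x^k.
    """
    if not values:
        return [1.0 / m] * m          # assumption: no data, stay uniform
    n_success = max(1, len(values) // 4)
    cutoff = sorted(values)[n_success - 1]   # largest value in bottom 25%
    counts = [0] * m
    successes = [0] * m
    for v, j in zip(values, bins):
        counts[j] += 1
        if v <= cutoff:
            successes[j] += 1
    # r_j = successes / points in subinterval j (0 if it holds no points)
    ratios = [s / c if c else 0.0 for s, c in zip(successes, counts)]
    total = sum(ratios)
    return [r / total for r in ratios]

values = [1.0, 5.0, 2.0, 6.0, 7.0, 8.0, 3.0, 4.0]
bins = [0, 0, 1, 1, 2, 2, 3, 3]
p = allocation_probabilities(values, bins, m=4)
```

In this example the two successes fall in subintervals 0 and 1, so
those two share all of the probability and the other two receive none.
A subinterval that reaches zero probability is never sampled again
under this rule.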

In early tests it became apparent that performance of the
various strategies fluctuated considerably, depending on the particular
test problem under investigation. For example, relative to the other
strategies, S2 performed spectacularly on some problems but miserably
on others. On closer examination it was found that S2 did well on
problems for which the global f value was significantly lower than
the local minima and for which the global region of attraction was
quite large; that is, on problems which were rather easy to solve.
This suggests the need for a benchmark strategy to be used for
assessing problem difficulty. The benchmark strategy should have as
little structure as possible. We have chosen to use the pure random
search method for this purpose.


Strategy S0 (Pure Random - Brooks [2])

a.  Set k = 1.

b.  Randomly select $x^k \in S$. Evaluate $f(x^k)$.

c.  Replace k with k + 1. Go to step b. At each stage
    retain the best f value found to date.

This strategy may be regarded as a benchmark method since it makes no
attempt to take advantage of the information gathered at previous
stages. In this sense it is probably the most primitive strategy
possible. We can use S0 in two ways:

1.  If a strategy does not do considerably better than S0,
    it should be discarded.

2.  If a test problem is such that S0 can solve it nearly
    as well as the other strategies, then the problem is
    not very difficult and probably is not useful for
    discriminating among strategies.
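
The benchmark is short enough to state in full. The sketch below
assumes uniform sampling over a box S; the name `strategy_s0` and the
example objective are hypothetical.

```python
import random

def strategy_s0(f, lower, upper, n_evals, rng):
    """Pure random search: sample, evaluate, keep the best value seen."""
    best_x, best_f = None, float("inf")
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f

# Hypothetical usage on a 2-variable bowl with its minimum at the origin.
bowl = lambda x: x[0] ** 2 + x[1] ** 2
best_x, best_f = strategy_s0(bowl, [-1.0, -1.0], [1.0, 1.0],
                             n_evals=1000, rng=random.Random(1))
```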

IV.   Computational  Experiments 

A number of computational experiments were performed to compare
the various strategies presented above. For each of the test functions
employed, each strategy was run 30 times with different random number
sequences. A run was allowed to continue until the algorithm had
required 1000 evaluations of the objective function f(x).

Test problems with predictable local and global solutions were
constructed using the objective function

     $f(x) = - \sum_{j=1}^{m} c_j \exp[ (x - p_j)' A_j (x - p_j) ]$

This function consists of the superposition of m modes, where mode
j has depth $c_j \in E^1$, position $p_j \in E^n$, and shape and width
determined by the n x n negative definite matrix $A_j$. Particular test
functions were obtained by choosing the parameters $c_j$ and $p_j$ from
a random number table. $A_j$ was chosen to ensure that the m modes
were narrow enough that they did not completely merge into one another.

Strategies S1 through S4 require an unconstrained minimizer.
Since the purpose of the study is to compare global strategies, a
minimizer is desired which uses the same information as is available to
the other strategies - function values but not derivatives. Powell's
derivative free method was selected [4].
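
A test function of this form is easy to build programmatically. The
sketch below is illustrative: the name `make_test_function` and the
parameter values in the example are hypothetical and are not the
values used for problems A through J.

```python
import math

def make_test_function(c, p, A):
    """Return f(x) = -sum_j c_j exp[(x - p_j)' A_j (x - p_j)].

    c: list of depths, p: list of mode positions, A: list of
    negative definite matrices (lists of rows).
    """
    def f(x):
        total = 0.0
        for cj, pj, Aj in zip(c, p, A):
            d = [xi - pi for xi, pi in zip(x, pj)]
            n = len(d)
            # Quadratic form d' A_j d; negative away from p_j.
            q = sum(d[r] * Aj[r][s] * d[s]
                    for r in range(n) for s in range(n))
            total += cj * math.exp(q)
        return -total
    return f

# Two narrow, well separated modes in two variables (made-up values).
narrow = [[-4.0, 0.0], [0.0, -4.0]]
f = make_test_function(c=[9.0, 5.0],
                       p=[[0.0, 0.0], [3.0, 3.0]],
                       A=[narrow, narrow])
```

When the modes are this well separated, f at each position $p_j$ is
very nearly $-c_j$, so the depths of the local and global minima can
be read off from the parameters.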


V.   Results 

The computational results obtained are summarized in Tables 1
and 2. Table 1 gives characteristics of the test problems used. Table
2 lists for each problem and for each strategy the best f value
obtained after 200, 500, and 1000 function evaluations. Each value is
the average of the 30 trials conducted for that problem and strategy.
The percentage of the 30 trials which did not locate the global
minimum after 1000 function evaluations is also given in Table 2. It is

             Number of     Number of     Value of Global
Problem      Variables      Minima           Minimum

   A             2             4              - 9.0
   B             2            10              - 9.9
   C             2            10              - 9.3
   D             2            10              - 9.8
   E             2            10              -13.0
   F             5             5              - 9.4
   G             5             5              -10.1
   H             5            10              -10.0
   I             5            10              - 8.9
   J             5            20              -11.9

                      Table 1
          Characteristics of Test Problems

Function                      S0      S1      S2      S3      S4      S5      S6

A   best f after 200        - 8.6   - 8.5   - 9.0   - 8.2   - 8.6   - 8.5   - 8.7
    best f after 500        - 8.8   - 8.9   - 9.0   - 8.9   - 9.0   - 8.7   - 9.0
    best f after 1000       - 8.9   - 9.0   - 9.0   - 9.0   - 9.0   - 8.8   - 9.0
    % failures                 -      0.0     0.0     0.0     0.0    20.0     0.0

B   best f after 200        - 9.0   - 8.9   - 9.7   - 9.0   - 9.5   - 9.1   - 9.1
    best f after 500        - 9.6   - 9.3   - 9.8   - 9.9   - 9.9   - 9.7   - 9.8
    best f after 1000       - 9.7   - 9.8   - 9.9   - 9.9   - 9.9   - 9.8   - 9.9
    % failures                 -      3.3     0.0     0.0     0.0    10.0     0.0

C   best f after 200        - 7.6   - 8.3   - 8.1   - 8.8   - 7.8   - 7.8   - 7.7
    best f after 500        - 8.0   - 8.6   - 8.2   - 9.1   - 8.5   - 8.1   - 8.0
    best f after 1000       - 8.3   - 8.9   - 8.6   - 9.2   - 8.7   - 8.2   - 8.2
    % failures                 -     33.3    53.3     3.3    43.3    83.3    80.0

D   best f after 200        - 8.6   - 8.9   - 9.2   - 7.8   - 7.4   - 8.8   - 8.8
    best f after 500        - 9.1   - 9.5   - 9.5   - 9.4   - 8.5   - 9.2   - 9.4
    best f after 1000       - 9.4   - 9.7   - 9.6   - 9.7   - 9.6   - 9.2   - 9.6
    % failures                 -     10.0    30.0     6.7    33.3    73.3    33.3

E   best f after 200        -10.2   -10.1   -11.8   - 8.3   - 9.5   -10.9   -10.2
    best f after 500        -11.6   -12.1   -12.8   -10.5   -11.2   -12.6   -12.3
    best f after 1000       -12.1   -12.7   -12.9   -12.0   -13.0   -12.7   -12.8
    % failures                 -     10.0     3.3    30.0     0.0     6.7     3.3

F   best f after 200        - 0.3   - 6.7   - 5.0   - 6.4   - 5.8   - 0.8   - 0.8
    best f after 500        - 1.0   - 7.9   - 5.0   - 8.0   - 8.7   - 2.9   - 3.1
    best f after 1000       - 1.5   - 8.7   - 5.6   - 8.5   - 8.9   - 7.0   - 7.5
    % failures                 -     60.0    86.7    43.3    33.3    80.0    76.7

G   best f after 200        - 4.1   - 7.4   - 7.3   - 7.1   - 7.5   - 5.0   - 4.7
    best f after 500        - 5.5   - 9.3   - 8.8   - 9.7   - 9.7   - 8.3   - 8.2
    best f after 1000       - 6.1   -10.0   - 9.1   - 9.9   -10.1   - 9.5   - 9.3
    % failures                 -      3.3    56.7    10.0     0.0    16.7    40.0

H   best f after 200        - 3.4   - 7.6   - 7.0   - 6.8   - 7.4   - 3.7   - 3.6
    best f after 500        - 4.6   - 8.3   - 7.3   - 8.7   - 9.2   - 6.3   - 7.2
    best f after 1000       - 5.2   - 8.9   - 7.7   - 9.2   - 9.7   - 8.2   - 8.9
    % failures                 -     73.3    93.3    56.7    20.0    60.0    50.0

I   best f after 200        - 3.9   - 7.6   - 6.3   - 6.5   - 6.7   - 4.2   - 4.2
    best f after 500        - 4.7   - 8.0   - 7.4   - 8.0   - 7.8   - 5.8   - 5.3
    best f after 1000       - 5.3   - 8.8   - 7.6   - 8.4   - 8.6   - 6.9   - 6.1
    % failures                 -     10.0    66.7    33.3    36.7    80.0   100.0

J   best f after 200        - 3.3   - 7.4   - 6.3   - 6.7   - 6.5   - 3.8   - 3.6
    best f after 500        - 4.1   - 8.8   - 6.6   - 7.4   - 8.1   - 5.3   - 4.6
    best f after 1000       - 4.8   - 9.7   - 7.2   - 8.8   - 8.3   - 7.4   - 6.5
    % failures                 -     43.3    83.3    66.7    76.7    73.3    90.0

                      Table 2
                    Test Results

difficult to obtain a single measure of performance for this kind of
problem since we must balance speed of convergence against the chance
that the global solution will be missed entirely.

From  these  test  results  we  can  draw  some  general  conclusions: 

1.  Test functions A and B were not very challenging since
    S0 did nearly as well as most other strategies.

2.  S2 seems to make rapid initial progress but frequently
    stops short of the global solution - it is not
    recommended.

3.  In general, S1, S3, and S4 perform about the same
    and better than the other strategies.

4.  S5  and  S6   exhibit  slow  initial  convergence.   Both 
frequently  tend  to  concentrate  the  search  effort  around 
a  good  local  minimum  which  is  not  global. 

5.  On  difficult  problems  even  the  best  of  these  methods 
will  frequently  fail  to  locate  the  global  minimum. 

It  is  also  interesting  to  examine  the  entire  graph  of  the  number 
of  function  evaluations  versus  the  best  function  value  obtained  for 
each  strategy.   These  curves  are  shown  for  test  function  H  in  Figure 
1.   The  results  for  function  H  are  representative  of  those  obtained 
for  the  other  functions  and  serve  to  emphasize  conclusions  2,  3,  and  4 
above. 

In conclusion, it is appropriate to note that these six methods
do not come near to exhausting the possible techniques for avoiding
local solutions. Methods which are hybrids of these and entirely new
methods should be tested. In particular, we hope to develop an
algorithm which allocates unconstrained minimizations to various
regions similar to the way strategy S6 allocates the individual points
$x^k$. Such a method would combine the rapid local optimizing power of
the minimization method with a global analysis of the feasible region.